Effects of Memory Sharing on Contemporary Processor Architectures⋆

نویسندگان

Vlastimil Babka

Petr Tuma

چکیده

Memory subsystems of contemporary processor architectures are typically equipped with a multitude of caches, which make the behavior of the memory subsystem difficult to anticipate especially when the subsystem is shared by multiple running applications. The paper presents early experimental results that dispel some preconceived notions about the memory subsystem, with applications in system design and performance engineering.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Latencies of Conflicting Writes on Contemporary Multicore Architectures

This paper provides a detailed investigation of latency penalties caused by repeated memory writes to nearby memory cells from different threads in parallel programs. When such writes map to the same corresponding cache lines in multiple processors, one can observe the so called false sharing effect. This effect can unnecessarily hamper parallel code due to the line granularity based cache hier...

متن کامل

Ultra-Low-Energy DSP Processor Design for Many-Core Parallel Applications

Background and Objectives: Digital signal processors are widely used in energy constrained applications in which battery lifetime is a critical concern. Accordingly, designing ultra-low-energy processors is a major concern. In this work and in the first step, we propose a sub-threshold DSP processor. Methods: As our baseline architecture, we use a modified version of an existing ultra-low-power...

متن کامل

Design Challenges of Scalable Operating Systems for Many-Core Architectures

Computers will move from the multi-core reality of today to manycore. Instead of only a few cores on a chip, we will have thousands of cores available for use. This new architecture will force engineers to rethink OS design. It is the only way for operating systems to remain scalable even as the number of cores increases. Presented here are three design challenges of operating systems for many-...

متن کامل

A High Performance Parallel IP Lookup Technique Using Distributed Memory Organization and ISCB-Tree Data Structure

The IP Lookup Process is a key bottleneck in routing due to the increase in routing table size, increasing traıc and migration to IPv6 addresses. The IP address lookup involves computation of the Longest Prefix Matching (LPM), which existing solutions such as BSD Radix Tries, scale poorly when traıc in the router increases or when employed for IPv6 address lookups. In this paper, we describe a ...

متن کامل